Abstract

MLL1 is a histone H3Lys4 methyltransferase that forms a complex with WDR5 and other
components. It plays important roles in developmental events, transcriptional regulation, and
leukemogenesis. MLL1-fusion proteins resulting from chromosomal translocations are molecular hallmarks
of a special type of leukemia, which occurs in over 70% of infant leukemia patients and is often accompanied
by a poor prognosis. Investigations in the past years on leukemogenesis and the MLL1-WDR5 histone H3Lys4
methyltransferase complex demonstrate that epigenetic regulation is one of the key steps in development
and human diseases.
Introduction

Leukemia is characterized by an abnormal increase
of white blood cells in the blood or bone marrow. Among
all types of cancers, leukemia has the highest morbidity
in patients below 35 years old. Over 70% of
the infant leukemia patients bear a translocation involving
chromosome 11, which results in the fusion of the mixed
lineage leukemia gene (MLL1) with other genes. MLL1
translocations are also found in approximately 10% of
adult acute myeloid leukemia (AML) patients, who had
previously been treated with topoisomerase II inhibitors for
other types of cancers. MLL1 is the human homologue
of the Saccharomyces cerevisiae (S. cerevisiae) gene Set1
and the Drosophila gene Trx. It encodes an enzyme
catalyzing the methylation of histone H3 at lysine 4
(H3Lys4). Trimethylation of histone H3Lys4 is a
hallmark of active gene transcription, and alteration of
this process often causes changes in gene expression
patterns. MLL1 translocation is also linked to altered
transcription of important genes involved in stem cell
maintenance and development and thus leads to
leukemogenesis.
Roles of MLLs in Development and
Leukemogenesis 

MLL1 and leukemogenesis

The MLL1 gene was first discovered in leukemia
patients in 1991. Its cDNA is ~12 kb long
and encodes a peptide over 4000 amino acids in length.
In the cell, the MLL1 precursor protein is cleaved by
taspase, which results in two peptides: a 300 kDa
N-terminal fragment and a 170 kDa C-terminal fragment.
The two cleaved peptides form a heterodimer, which is
then complexed with other components, including WD
repeat domain 5 (WDR5), retinoblastoma binding protein
5 (RBBP5), ash2 (absent, small, or homeotic)-like
(ASH2L), dpy-30 homolog (DPY30), multiple endocrine
neoplasia I (MEN1), and host cell factor C1 (HCF1). In
some leukemia patients, chromosomal translocation
results in fusion of ~4.2 kb of DNA from the MLL1 N-terminal
coding region with some other genes. The generation of
an MLL1 fusion protein is sufficient to induce leukemia,
as demonstrated in animal models. The
mechanisms of MLL1 fusion-mediated leukemia have
been studied extensively in the past twenty years.
Normal MLL1 regulates the expression of homeobox A9
(HOXA9) and meis homeobox 1 (MEIS1), which are
essential for self-renewal of hematopoietic stem cells
(HSCs). After HSC differentiation, the expression of
these genes is shut down to prevent tumorigenesis.
However, in leukemia patients, MLL1 fusion proteins
constitutively activate the expression of HOXA9 and
MEIS1 in differentiated cells, leading to leukemia (Figure
1). MLL1 requires a series of binding events to
transcribe HOXA9 successfully; its
cysteine-x-x-cysteine (CXXC) domain and plant homeodomain (PHD)
are required to bind the polymerase-associated factor
(PAF) complex and the histone H3Lys4 methylation site,
respectively. Both wild-type and fusion MLL1 exist in
leukemia cells. Since MLL1 fusion proteins only contain
the N-terminal fragment and lack the PHD fingers, they
cannot activate HOXA9 expression alone. Wild-type
MLL1 has also been shown to be indispensable for
HOXA9 expression in leukemia cells.

Partners of MLL1

The fusion partners of MLL1 are quite diverse.
Indeed, more than 60 genes have been found to fuse
with the MLL1 N-terminal coding region, and in some
cases, the N-terminal coding region of the MLL1 gene is
duplicated. As all fusions involving MLL1 can eventually
cause leukemia, MLL1 was considered to be the major
determinant of leukemogenesis. This has been
demonstrated by the observation that Mll1 artificially
fused with the bacterial lacZ gene is still able to induce
leukemia in a mouse model. However, compared with
fusion proteins that occur naturally in patients, the
MLL1-β-galactosidase fusion required a much longer
time to induce leukemia, suggesting that the fusion
partner genes also influence leukemogenesis. A careful
inspection of all the partner genes indicates that the
most pathogenic partner genes, such as ALL1-fused
gene from chromosome 4 protein (AF4), ALL1-fused
gene from chromosome 9 protein (AF9), ENL/MLL fusion
(ENL), and eleven-nineteen lysine-rich leukemia gene
(ELL), are usually involved in transcription elongation.
Further investigations with mouse models have
demonstrated that these genes usually require a much
shorter time to induce leukemia compared with other
partners.

Recently, Lin et al. purified complexes of several
MLL1 fusion proteins from cell extracts and analyzed
their associated proteins by mass spectrometry. They
found that AF4/FMR2 family, member 4 (AFF4), a
component of the ELL-TEFb transcription elongation
complex, was commonly associated with many types of
MLL1 fusion proteins. Depletion of AFF4 suppressed
MLL1-fusion-dependent transcription, suggesting an
essential role for AFF4 in the function of MLL1 fusions.
This discovery links MLL1-dependent leukemogenesis
with transcription elongation. It is possible that the
acquisition of fusion partners into the MLL1 complex
overcomes the requirement for the transcription initiation
step and results in immediate transcription elongation
without upstream signals.

MLL1 homologues and their complex formation

Although MLL1 has been studied extensively in
animal models, its biochemical functions were not clear
until Set1, its homologue in S. cerevisiae, was
characterized as a histone H3Lys4 methyltransferase.
Set1 is the only enzyme in S. cerevisiae that
catalyzes histone H3Lys4 methylation. It is capable of
catalyzing mono-, di-, and tri-methylation of histone
H3Lys4 in vivo and in vitro. The Set1 complex, also
called COMPASS (complex of proteins associated with
Set1), contains six subunits besides Set1:
Cps60, Cps50, Cps40, Cps35, Cps30, and Cps25.

Figure 1. MLL1 fusion proteins induce leukemia together with the wild-type MLL1. The wild-type MLL1 complex pre-binds to the promoters of oncogenes
HOXA9 and MEIS1 and is required for proper transcription activation of these genes. The MLL1 fusion proteins are associated with Menin, AFF4, and other
proteins in the cell. The MLL1 fusion complex constantly activates the expression of HOXA9 and MEIS1, leading to induction of leukemia.

All components in the complex are highly
conserved from S. cerevisiae to human. Three
homologues of S. cerevisiae Set1 exist in Drosophila:
Set1, Trx, and Trr. In mammals, six
homologous genes were discovered, including SETD1A
and SETD1B for Set1, MLL1/ALL-1/HRX and
MLL2/HRX2/WBP7 for Trx, and MLL3/KMT2C and
MLL4/ALR for Trr (Table 1). The gene products of these
mammalian homologues each contain a conserved SET
domain, which catalyzes the methylation of histone H3 at
lysine 4.

In past years, considerable efforts have been
invested to characterize the complexes associated with
the six SET1/MLL proteins. Several studies show that
the complexes are very similar and all contain several
common components, including WDR5, RBBP5, ASH2L,
and DPY30 (Table 1). These four proteins, conserved
from S. cerevisiae to human, are believed to bind to
conserved SET domains in the SET1/MLL proteins. Each
complex also contains specific subunits, suggesting that the
biological functions of the six SET1/MLL proteins are
different (Figure 2). This notion is supported by studies
with animal models. Deletion of the Mll1 gene in mice
caused embryonic lethality at 10.5 days, whereas
deletion of the Mll2 gene led to embryonic lethality at 11.5
days, suggesting that both Mll1 and Mll2 have important
and non-redundant roles in development. Biochemical
studies indicated that complexes involving the gene
products of SETD1A and SETD1B, SET1A and SET1B,
respectively, are the most robust trimethylating enzymes
both in vitro and in vivo. Knockdown of SETD1A and
SETD1B led to a severe reduction of global H3Lys4
trimethylation but not mono- or dimethylation in cells.

Methylation of histone H3Lys4 

The N-terminal tail of histone H3 is covalently
modified in many ways, including methylation,
acetylation, ubiquitination, phosphorylation, and others.
The fourth lysine of histone H3 can be methylated to
mono-, di-, and tri-methylated states. Most of the histone
H3Lys4 trimethylation occurs near the transcription start
site of actively transcribed genes, and the level of
methylation is well correlated with the gene transcription
level. Many studies investigating histone H3Lys4
methylation have been published, but how histone
H3Lys4 methylation regulates gene transcription is still
not fully understood. Several proteins have been shown
to bind and read out marks of histone H3Lys4
methylation. These include chromodomain helicase DNA
binding protein 1 (CHD1), which binds methylated
H3Lys4 through its chromo domain, and several other
proteins containing PHD fingers, such as inhibitor of
growth protein 2 (ING2), inhibitor of growth protein 4
(ING4), inhibitor of growth protein 5 (ING5),
recombination activating gene 2 (RAG2), bromodomain
PHD finger transcription factor (BPTF), and
TBP-associated factor 3 (TAF3).

CHD1 is a common subunit in both the SAGA
(Spt-Ada-Gcn5 Acetyltransferase) and SLIK (SAGA-like)
histone acetyltransferase complexes. It contains two
chromo domains, one of which is able to recognize and
bind the dimethylated form of histone H3Lys4. CHD1
may bridge histone H3Lys4 dimethylation with histone
acetylation and the subsequent gene transcription steps.
BPTF is a subunit of nucleosome remodeling factor
(NURF), an ISWI-containing ATP-dependent
chromatin-remodeling complex. Its PHD finger binds trimethylated
histone H3Lys4. BPTF deficiency causes the abnormal
expression of hox genes in Xenopus embryos, which
mimics the WDR5 loss-of-function phenotypes. The
basal transcription factor TFIID can also recognize
histone H3Lys4 trimethylation by its TAF3 subunit, which
also contains a PHD finger. The aforementioned
observations suggest that histone H3Lys4 di- and
trimethylation marks crosstalk with active gene
transcription. Interestingly, one member of the ING
protein family, ING2, was shown to bind trimethylated
histone H3Lys4 on actively transcribed genes and then
repress transcription after DNA damage. This suggests
that histone H3Lys4 trimethylation serves as a marker
for actively transcribed genes, and that the recognition of
this marker can lead to different results, including
transcription activation or repression.

Table 1. Complexes of mammalian histone H3Lys4 methyltransferases
Set1 is the only histone H3Lys4 methyltransferase in S. cerevisiae. It has three homologues in Drosophila and six in mammals. The six mammalian enzymes
have four common subunits, including WDR5, RBBP5, ASH2L, and DPY30. SETD1A and SETD1B are homologues of Drosophila Set1 and their specific
subunits are WDR82 and CXXC1. MLL1 and MLL2 are homologues of Drosophila Trx and their specific subunits are MEN1 and HCF1. MLL3 and MLL4 are
homologues of Drosophila Trr and their specific subunits are PTIP, PA1, UTX, and NCOA6.

Figure 2. WDR5 serves as an adaptor protein in multiple complexes and related biological processes.

Besides transcriptional regulation, histone H3Lys4
methylation also plays important roles in chromatin
recombination. RAG2 is an essential component of the
RAG1/2 V(D)J recombinase, which mediates
antigen-receptor gene assembly during B-cell development. It
contains a PHD finger that specifically recognizes
histone H3 trimethylated at lysine 4. In vivo, RAG2
mutations affecting the PHD finger impaired V(D)J
recombination. Reducing the level of histone H3Lys4
trimethylation also led to a decrease of V(D)J
recombination. Furthermore, a genome-wide study of
the localization of histone H3Lys4 trimethylation in
meiotic S. cerevisiae showed that the level of histone
H3Lys4 trimethylation was constantly high at sites of
programmed double-strand breaks (DSBs) that initiate
interhomologue recombination. When the only histone
H3Lys4 methyltransferase in S. cerevisiae, Set1, was
deleted, the rate of DSB formation at these sites was severely
reduced. These observations suggest that H3Lys4
methylation plays an important role in chromatin
recombination during meiosis.

MLL1 in transcription regulation

In the past twenty years, the mechanism by which
MLL1 fusion proteins regulate leukemogenesis has been
heavily investigated. In contrast, only very few studies
have explored the physiological functions of MLL1.
Because histone H3Lys4 methylation has a global
regulatory function in gene transcription, the six
mammalian methyltransferases, which include MLL1,
should have much more important roles in development
and other physiological events. Gene knockout mouse
models also support this idea. Mll1+/- mice are small at
birth and have retarded growth. Though the maturation
of myeloid and lymphoid cells seems normal, B-cells are
consistently reduced in number. These observations
suggest that MLL1 plays roles in multiple stages of
development.

An early ChIP-on-chip assay indicated that MLL1
often co-localized with RNA polymerase II and that
histone H3Lys4 trimethylation occurred on the promoter
region of actively transcribed genes, therefore prompting
the conclusion that MLL1 is a common transcription
factor essential for gene transcription. However, other
studies reached different conclusions because the
embryonic fibroblasts of Mll1 knockout mice grew
normally, and gene expression patterns in deficient cells
were not significantly altered. Furthermore, MLL1 has
also been reported to interact with RNA polymerase II
and regulate the expression of only a subset of actively
transcribed targets in vivo. Genome-wide ChIP-on-chip
assays indicated that histone H3Lys4 trimethylation was
reduced on only ~5% of the actively transcribed genes in
MLL1 knockout cells in comparison with the wild-type
counterparts. A similar result was also observed for the
global gene expression profile. Why the MLL1 protein binds
to a major fraction of gene promoters but functions at only
a few of them is still unknown.

Several recent studies indicate that MLL1 has an
important role in cell cycle regulation. MLL1 was reported
to be recruited to the promoter of E2F by HCF1, a
subunit found in the MLL1 complex. E2F is a family of
transcription factors regulating cell proliferation by
activating or repressing gene expression during the G1/S
phase transition. Thus, MLL1 regulates cell cycle
progression by controlling histone H3Lys4 trimethylation
of the E2F promoter, which affects its expression level.
Furthermore, during M phase, the MLL1 complex has been
found to associate with condensed chromatin on genes
primarily expressed during interphase. MLL1 is
hypothesized to bind these genes and regulate their
methylation and transcription immediately after M phase
exit. Since MLL1 is important for gene transcription
in the cell cycle, its expression level is also tightly
regulated. SCF-Skp2, an important E3 ubiquitin ligase for
p27 during the G1/S transition, was reported to regulate
MLL1 degradation at S phase. Also, APC-Cdc20
ubiquitinates MLL1 at late M phase and regulates its
stability. The degradation of MLL1 in S and M phases
is essential for proper cell cycle progression.
Interestingly, some MLL1 fusion proteins are resistant to
the E3 ligases mentioned above, suggesting another
possible mechanism of tumorigenesis caused by MLL1
fusion proteins.

Functions of WDR5 

WDR5 in the MLL1 complex

WDR5 is a common subunit of all six mammalian
histone H3Lys4 methyltransferases previously
mentioned. Its homologue in S. cerevisiae is Cps30.
The protein level of Set1 decreases dramatically in the
Cps30 knockout S. cerevisiae strain, probably due to the
dissociation of the Set1 complex and subsequent protein
degradation. WDR5 consists of 334 amino acids and
contains seven typical WD40 repeat domains, each
approximately 40 amino acids. Structural studies
suggest that the WD40 repeats form a seven-bladed
propeller fold, with each blade consisting of a
four-stranded anti-parallel sheet. This structural property
suggests that WDR5 has many exposed surfaces, which
make it a perfect adaptor to interact with other proteins.
In vitro pull-down assays indicate that WDR5 prefers to
bind dimethylated histone H3Lys4 peptide. However,
the crystal structure of WDR5 complexed with histone
H3 peptide does not confirm the in vitro results. Quantitative
binding analyses demonstrated that the binding of WDR5
with histone H3 N-terminal peptides methylated to four
different states varied only modestly. Structural studies
of the whole MLL1-WDR5 complex may resolve
conflicting observations between in vivo and in vitro
experiments.

To complicate matters further, two recent
studies suggested a different role for WDR5 in the MLL1
complex. The Win motif, consisting of amino acid
residues 3762-3767 next to the SET domain in the MLL1
protein, was independently discovered to mediate the
interaction of MLL1 with WDR5. The crystal structure
of the peptide with WDR5 shows that the Win motif,
which is the analogue of the H3 N-terminal peptide, binds
WDR5 in the central depression of the β-propeller
by adopting a 3₁₀-helical structure and inserting Arg3765
into the central channel. This discovery led to different
models in the two studies. One proposed that the
arginine binding site in WDR5 facilitates its association
with MLL1. However, the other hypothesized that binding
of mono- and dimethylated histone H3Lys4 with WDR5
dissociates it from the MLL1 complex and thus limits the
methyltransferase activity. These two different models
have yet to be validated experimentally.

Functions of WDR5 beyond histone methylation

Most of the studies of MLL1 and WDR5 focused on
the methylation of histone H3 at the Lys4 site and the
relationship to leukemogenesis. However, recent reports
suggest that the functions of WDR5 are diverse
(Figure 2). WDR5 was reported to associate with
chromodomain helicase DNA binding protein 8 (CHD8),
a chromatin remodeling factor and regulator of the WNT
signaling pathway. CHD8 can directly interact with
β-catenin and regulate the β-catenin-responsive genes.
Thus, WDR5 may be involved in chromatin remodeling
and the regulation of genes downstream of β-catenin.

WDR5 also plays important roles in histone
acetyltransferase complexes and the related
physiological functions. The Drosophila homologue of
WDR5, Wds, was reported to be one component of
ATAC (Ada2a-containing complex), an essential
histone acetyltransferase complex in Drosophila. This
complex acetylates histone H4Lys16 in Drosophila
embryos and facilitates nucleosome sliding along DNA
catalyzed by ISWI and SWI-SNF complexes. In human
cells, WDR5 was also found to be associated with the
human ATAC. Interestingly, many other factors were
also reported in the human ATAC, including co-factors of
chromatin assembly/remodeling and DNA replication
machineries (POLE3/CHRAC17 and POLE4), stress-
and TGF-β-activated protein kinase (TAK1/MAP3K7),
and MAP3-kinase regulator (MBIP). Furthermore, in both
human and Drosophila, WDR5 is a subunit in the MOF
complex, which is the key complex regulating X
chromosome dosage compensation by acetylating
histone H4Lys16.

Because WDR5 is an essential component of the
histone methylation, acetylation, and chromatin
remodeling complexes, WDR5 is believed to serve as an
adaptor protein for complex assembly. However, it may
also contribute to other physiological phenomena.
Recently, it was reported that WDR5 is an important
component for assembly or stability of the
virus-induced-signaling adapter (VISA)-associated
complex, which plays a key role in virus-triggered
induction of type I interferons (IFNs) and anti-viral
innate immune response. Previous studies have
demonstrated that VISA is located at the outer
membrane of mitochondria. Interestingly, this study
revealed that WDR5 was not only localized in the
nucleus as believed before, but also abundantly localized
in the cytoplasm. Viral infection caused translocation of
WDR5 from the nucleus to the mitochondria-located
VISA complex, where it played a role in the assembly
and stability of the VISA complex. This study
demonstrates for the first time a cytoplasmic function for
WDR5, specifically in virus-triggered signaling resulting
in induction of type I IFNs (Figure 2).



The Remaining Questions 

It has been ~20 years since the MLL1 gene was
discovered and its relationship to leukemia was
established. Many details have been elucidated on how
MLL1 fusion proteins cause leukemia. However, it is still
not clear why MLL1 fusion proteins function differently
from their wild-type counterparts. It seems that wild-type
MLL1 protein is somehow restricted and not oncogenic in
normal cells; however, the MLL1 fusion can overcome
these restrictions. Also, it is unclear why MLL1 fusion
causes malignancy only in hematopoietic cells but not in
other differentiated cells. Identification of the limiting
factors in MLL1-activated transcription might help to
prevent leukemogenesis or lead to novel therapies for
patients with leukemia.

Designing specific drugs to treat patients with
MLL1-fusion-dependent leukemia is an ultimate goal of this
research. Based on current knowledge, small-molecule
therapeutics can be developed following several strategies.
The first is to target the enzymatic activity of MLL1.
However, since MLL1 may have multiple functions in
many other cell types, this strategy could have severe
side effects. The second is to target the interaction
between the CXXC domain of MLL1 and DNA or
between MLL1 and MEN1. However, as wild-type MLL1
also utilizes a similar mechanism, this approach might
also have side effects. The third is to target the
interaction between MLL1 fusion and the elongation
factors, such as AFF4. However, this strategy still needs
to be tested.

Despite our growing knowledge about MLL1 fusion
proteins and leukemia, very little is known about the
MLL1-WDR5 complex. It is quite possible that this
complex has more significant roles than expected.
Considering the importance of histone H3Lys4
methylation in global gene transcription regulation, MLL1
may regulate development of many tissues and organs.
Studies of conditional knockout mice will be critical for
investigating the functions of MLL1 in different tissues.
The substrate specificity of the MLL1-WDR5 complex
has emerged as another interesting question.
Although the histone lysine methyltransferases, unlike
acetyltransferases, seem to have good substrate
specificity, some, such as SET7 and SMYD2, have also
been reported to have multiple substrates. Does the
MLL1-WDR5 complex have substrates other than
histone H3Lys4? This is quite likely, as WDR5 interacts
with a variety of proteins and even has the ability to
translocate to the mitochondria. Undoubtedly, a new
picture of the physiological functions of the MLL1-WDR5
complex will soon emerge.